experience level
Randomized Controlled Trials for Conditional Access Optimization Agent
Bono, James, Cheng, Beibei, Lozano, Joaquin
AI agents are increasingly deployed to automate complex enterprise workflows, yet evidence of their effectiveness in identity governance is limited. We report results from the first randomized controlled trial (RCT) evaluating an AI agent for Conditional Access (CA) policy management in Microsoft Entra. The agent assists with four high-value tasks: policy merging, Zero-Trust baseline gap detection, phased rollout planning, and user-policy alignment. In a production-grade environment, 162 identity administrators were randomly assigned to a control group (no agent) or treatment group (agent-assisted) and asked to perform these tasks. Agent access produced substantial gains: accuracy improved by 48% and task completion time decreased by 43% while holding accuracy constant. The largest benefits emerged on cognitively demanding tasks such as baseline gap detection. These findings demonstrate that purpose-built AI agents can significantly enhance both speed and accuracy in identity administration.
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
Effect of Reporting Mode and Clinical Experience on Radiologists' Gaze and Image Analysis Behavior in Chest Radiography
Khoobi, Mahta, von der Stueck, Marc Sebastian, Ordonez, Felix Barajas, Iancu, Anca-Maria, Corban, Eric, Nowak, Julia, Kargaliev, Aleksandar, Perelygina, Valeria, Schott, Anna-Sophie, Santos, Daniel Pinto dos, Kuhl, Christiane, Truhn, Daniel, Nebelung, Sven, Siepmann, Robert
Structured reporting (SR) and artificial intelligence (AI) may transform how radiologists interact with imaging studies. This prospective study (July to December 2024) evaluated the impact of three reporting modes, free-text (FT), structured reporting (SR), and AI-assisted structured reporting (AI-SR), on image analysis behavior, diagnostic accuracy, efficiency, and user experience. Four novice and four non-novice readers (radiologists and medical students) each analyzed 35 bedside chest radiographs per session using a customized viewer and an eye-tracking system. Outcomes included diagnostic accuracy (compared with expert consensus using Cohen's $κ$), reporting time per radiograph, eye-tracking metrics, and questionnaire-based user experience. Statistical analysis used generalized linear mixed models with Bonferroni post-hoc tests at a significance level of $P \le .01$. Diagnostic accuracy was similar in FT ($κ = 0.58$) and SR ($κ = 0.60$) but higher in AI-SR ($κ = 0.71$, $P < .001$). Reporting times decreased from $88 \pm 38$ s (FT) to $37 \pm 18$ s (SR) and $25 \pm 9$ s (AI-SR) ($P < .001$). Saccade counts for the radiograph field ($205 \pm 135$ (FT), $123 \pm 88$ (SR), $97 \pm 58$ (AI-SR)) and total fixation duration for the report field ($11 \pm 5$ s (FT), $5 \pm 3$ s (SR), $4 \pm 1$ s (AI-SR)) were lower with SR and AI-SR ($P < .001$ each). Novice readers shifted gaze towards the radiograph in SR, while non-novice readers maintained their focus on the radiograph. AI-SR was the preferred mode. In conclusion, SR improves efficiency by guiding visual attention toward the image, and AI-prefilled SR further enhances diagnostic accuracy and user satisfaction.
- Europe > Austria > Vienna (0.14)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Understanding Practitioners Perspectives on Monitoring Machine Learning Systems
Naveed, Hira, Grundy, John, Arora, Chetan, Khalajzadeh, Hourieh, Haggag, Omar
Given the inherent non-deterministic nature of machine learning (ML) systems, their behavior in production environments can lead to unforeseen and potentially dangerous outcomes. For a timely detection of unwanted behavior and to prevent organizations from financial and reputational damage, monitoring these systems is essential. This paper explores the strategies, challenges, and improvement opportunities for monitoring ML systems from the practitioners' perspective. We conducted a global survey of 91 ML practitioners to collect diverse insights into current monitoring practices for ML systems. We aim to complement existing research through our qualitative and quantitative analyses, focusing on prevalent runtime issues, industrial monitoring and mitigation practices, key challenges, and desired enhancements in future monitoring tools. Our findings reveal that practitioners frequently struggle with runtime issues related to declining model performance, exceeding latency, and security violations. While most prefer automated monitoring for its increased efficiency, many still rely on manual approaches due to the complexity or lack of appropriate automation solutions. Practitioners report that the initial setup and configuration of monitoring tools is often complicated and challenging, particularly when integrating with ML systems and setting alert thresholds. Moreover, practitioners find that monitoring adds extra workload, strains resources, and causes alert fatigue. The desired improvements from the practitioners' perspective are: automated generation and deployment of monitors, improved support for performance and fairness monitoring, and recommendations for resolving runtime issues. These insights offer valuable guidance for the future development of ML monitoring tools that are better aligned with practitioners' needs.
Machine Learning (ML) systems are being increasingly employed across various domains, including social media, e-commerce, and engineering - even critical domains such as finance, healthcare, and autonomous vehicles nowadays leverage ML to automate and enhance their services. Generative AI and Large Language Models (LLMs) have further boosted ML adoption by creating several new use cases [1], [2]. A typical ML system lifecycle begins by gathering requirements and preparing data, which is followed by the development of the ML component (experimentation, model training, and evaluation) and other traditional software components [3]. After development, the next step is integration and system testing. Once quality assurance is completed, the ML system is deployed to a production environment.
- Oceania > Australia (0.05)
- North America > United States (0.04)
- Asia > Pakistan (0.04)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
- Research Report > Experimental Study (0.93)
OSS-UAgent: An Agent-based Usability Evaluation Framework for Open Source Software
Meng, Lingkai, Shao, Yu, Yuan, Long, Lai, Longbin, Cheng, Peng, Yu, Wenyuan, Zhang, Wenjie, Lin, Xuemin, Zhou, Jingren
Usability evaluation is critical to the impact and adoption of open source software (OSS), yet traditional methods relying on human evaluators suffer from high costs and limited scalability. To address these limitations, we introduce OSS-UAgent, an automated, configurable, and interactive agent-based usability evaluation framework specifically designed for open source software. Our framework employs intelligent agents powered by large language models (LLMs) to simulate developers performing programming tasks across various experience levels (from Junior to Expert). By dynamically constructing platform-specific knowledge bases, OSS-UAgent ensures accurate and context-aware code generation. The generated code is automatically evaluated across multiple dimensions, including compliance, correctness, and readability, providing a comprehensive measure of the software's usability. Additionally, our demonstration showcases OSS-UAgent's practical application in evaluating graph analytics platforms, highlighting its effectiveness in automating usability evaluation.
- Asia > China > Shanghai > Shanghai (0.05)
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
- Oceania > Australia > New South Wales (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
Dermatologist-like explainable AI enhances melanoma diagnosis accuracy: eye-tracking study
Chanda, Tirtha, Haggenmueller, Sarah, Bucher, Tabea-Clara, Holland-Letz, Tim, Kittler, Harald, Tschandl, Philipp, Heppt, Markus V., Berking, Carola, Utikal, Jochen S., Schilling, Bastian, Buerger, Claudia, Navarrete-Dechent, Cristian, Goebeler, Matthias, Kather, Jakob Nikolas, Schneider, Carolin V., Durani, Benjamin, Durani, Hendrike, Jansen, Martin, Wacker, Juliane, Wacker, Joerg, Consortium, Reader Study, Brinker, Titus J.
Artificial intelligence (AI) systems have substantially improved dermatologists' diagnostic accuracy for melanoma, with explainable AI (XAI) systems further enhancing clinicians' confidence and trust in AI-driven decisions. Despite these advancements, there remains a critical need for objective evaluation of how dermatologists engage with both AI and XAI tools. In this study, 76 dermatologists participated in a reader study, diagnosing 16 dermoscopic images of melanomas and nevi using an XAI system that provides detailed, domain-specific explanations. Eye-tracking technology was employed to assess their interactions. Diagnostic performance was compared with that of a standard AI system lacking explanatory features. Our findings reveal that XAI systems improved balanced diagnostic accuracy by 2.8 percentage points relative to standard AI. Moreover, diagnostic disagreements with AI/XAI systems and complex lesions were associated with elevated cognitive load, as evidenced by increased ocular fixations. These insights have significant implications for clinical practice, the design of AI tools for visual tasks, and the broader development of XAI in medical diagnostics.
- Europe > Austria > Vienna (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.14)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.93)
Study of software developers' experience using the Github Copilot Tool in the software development process
Jaworski, Mateusz, Piotrkowski, Dariusz
In software development there is constant pressure to produce code faster and faster without compromising on quality. New tools supporting developers are created in response to this demand. Currently a new generation of such solutions is about to be launched: Artificial Intelligence driven tools. On 29 June 2021 GitHub Copilot was announced. It uses a trained model to generate code from human-understandable language. The focus of this research was to investigate software developers' approach to this tool. For this purpose a survey containing 18 questions was prepared and shared with programmers. A total of 42 answers were gathered. The results of the research indicate that developers' opinions are divided. Most of them had encountered GitHub Copilot before taking the survey. The attitude to the tool was mostly positive, but not many participants were willing to use it. Concerns stem from security issues associated with using GitHub Copilot.
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.87)
Exclusive: OpenAI summarizes KDnuggets - KDnuggets
OpenAI has recently published an important work focused on the alignment problem: the problem of ensuring that general-purpose AI and machine learning systems align with human intentions. The "Paperclip Maximizer" is a famous example of alignment gone wrong. To test scalable alignment methods, OpenAI trained a model to summarize entire books, as described in their blog on KDnuggets: Scaling human oversight of AI systems for difficult tasks – OpenAI approach. The OpenAI model works by first summarizing small sections of a book, then summarizing those summaries into a higher-level summary, and so on. The results were pretty amazing, so we have asked OpenAI to summarize two top KDnuggets blogs from last year, and here are the summaries.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
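The section-by-section approach described in the summary above (summarize small sections, then summarize the summaries) can be sketched in a few lines. This is only an illustration, not OpenAI's implementation: the `toy_summarize` function is a hypothetical stand-in for a real summarization model, and the chunk size and function names are assumptions made for the example.

```python
from typing import Callable, List


def chunk(text: str, size: int) -> List[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def summarize_recursively(
    text: str, summarize: Callable[[str], str], size: int = 200
) -> str:
    """Summarize each chunk, join the summaries, and repeat until the
    remaining text fits in a single chunk; then summarize it once more."""
    if size <= 0:
        raise ValueError("size must be positive")
    while len(text) > size:
        summaries = [summarize(piece) for piece in chunk(text, size)]
        joined = " ".join(summaries)
        # Guard against a summarizer that never shortens its input,
        # which would otherwise loop forever.
        if len(joined) >= len(text):
            raise ValueError("summarize must shorten its input")
        text = joined
    return summarize(text)


def toy_summarize(piece: str) -> str:
    """Hypothetical stand-in for a model: keep the first 40 characters."""
    return piece[:40]


if __name__ == "__main__":
    book = "A long book. " * 200
    print(summarize_recursively(book, toy_summarize))
```

In practice `summarize` would call a language model, and chunks would usually follow section or paragraph boundaries rather than fixed character counts.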
Learn to Build Your own AI Chatbot (Make it Do Anything)
Chatbots are becoming a mainstream tool that organizations use to streamline and automate communication. Building them is a hot skill whether your Python programming experience level is basic or advanced, and the mix of AI and machine learning makes chatbots incredibly versatile and inexpensive. Start here and learn the basics of how they work.
Report: State of Artificial Intelligence in India - 2020
Artificial Intelligence or AI is a field of Data Science that trains computers to learn from experience, adjust to inputs, and perform tasks of certain cognitive levels. Over the last few years, AI has emerged as a significant data science function and, by utilizing advanced algorithms and computing power, AI is transforming the functional, operational, and strategic landscape of various business domains. AI algorithms are designed to make decisions, often using real-time data. Using sensors, digital data, and even remote inputs, AI algorithms combine information from a variety of different sources, analyze the data instantly, and act on the insights derived from the data. Most AI technologies – from advanced recommendation engines to self-driving cars – rely on diverse deep learning models. By utilizing these complex models, AI professionals are able to train computers to accomplish specific tasks by recognizing patterns in the data. Analytics India Magazine (AIM), in association with Jigsaw Academy, has developed this study on the Artificial Intelligence market to understand the developments of the AI market in India, covering the market in terms of Industry and Company Type. Moreover, the study delves into the market size of the different categories of AI and Analytics startups / boutique firms. As a part of the broad Data Science domain, the Artificial Intelligence technology function has so far been classified as an emerging technology segment. Moreover, the AI market in India has, till now, been dominated by the MNC Technology and the GIC or Captive firms. Domestic firms, Indian startups, and even International Technology startups across various sectors have, so far, not made a significant investment, in terms of operations and scale, in the Indian AI market. Additionally, IT services and Boutique AI & Analytics firms had not, till a couple of years ago, developed full-fledged AI offerings in India for their clients.
- Asia > India > Maharashtra > Mumbai (0.06)
- Asia > India > Karnataka > Bengaluru (0.05)
- Asia > India > Tamil Nadu > Chennai (0.05)
- Information Technology > Services (1.00)
- Banking & Finance (1.00)
- Automobiles & Trucks (1.00)
Impact on Jobs across Emerging Technologies During the Current Pandemic Crisis
Analytics India Magazine (AIM), along with Jigsaw Academy, has developed this study to focus on the impact on jobs across certain emerging technologies. Jigsaw Academy, with over 400 years of combined teaching experience, including online and remote learning delivery, is adept at training and upskilling professionals and freshers in key capabilities in emerging technologies like business analytics, data science, artificial intelligence, deep learning, cybersecurity, full stack development, and cloud computing, to name but a few. The broad Information Technology domain experienced significant growth and consolidation in 2019-2020. At the beginning of this year, various studies conducted by Analytics India Magazine indicated that the IT domain in general, and the specific domains of Artificial Intelligence, Deep Learning, Data Analytics, Machine Learning, and Cyber Security, to name a few, were experiencing significant growth in terms of revenues, investments, and salaries. Despite the lockdown and recessionary trends, specific domains and technologies across the IT space continue to develop at a steady pace. The Covid pandemic has unfortunately affected the broader global and Indian economies – economic activity across the globe has slowed down after a strict lockdown in activity across all major economies. One of the other impacts of the disruption, due to the unfortunate recession and pandemic, is that there has been a shift of jobs and roles to Tier 2 and Tier 3 cities. Before the lockdown, a small percentage of job roles (3-4%) were advertised for the Tier 2 and Tier 3 cities – locations outside the IT, Technology, and BPO hubs. There has now been a significant shift to an average of about 8% of the jobs advertised in Tier 2 and Tier 3 cities. This highlights that jobs are increasingly becoming location independent and are now advertised across several locations, including small cities and large towns.
- Information Technology > Security & Privacy (1.00)
- Education > Educational Setting > Online (0.66)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
- Health & Medicine > Therapeutic Area > Immunology (0.48)